AITopics | human mesh

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

Hyperbolic Space Learning Method Leveraging Temporal Motion Priors for Human Mesh Recovery

Zhang, Xiang, Wu, Suping, Qiu, Weibin, Jin, Zhaocheng, Yang, Sheng

arXiv.org Artificial Intelligence, Oct-22-2025

3D human meshes have a natural hierarchical structure (e.g., torso, limbs, fingers). However, existing video-based 3D human mesh recovery methods usually learn mesh features in Euclidean space, where this hierarchy is hard to capture accurately, so they reconstruct erroneous human meshes. To solve this problem, we propose a hyperbolic space learning method that leverages a temporal motion prior to recover 3D human meshes from videos. First, we design a temporal motion prior extraction module. This module extracts temporal motion features from the input 3D pose sequences and image feature sequences separately, then combines them into the temporal motion prior, which strengthens feature expression along the temporal motion dimension. Since data representations in non-Euclidean spaces, especially hyperbolic space, have been shown to capture hierarchical relationships in real-world datasets effectively, we further design a hyperbolic space optimization learning strategy. This strategy uses the temporal motion prior to assist learning, and uses 3D pose and pose motion information separately in hyperbolic space to optimize and learn the mesh features. We then combine the optimized results to obtain an accurate and smooth human mesh. In addition, to keep the optimization of human meshes in hyperbolic space stable and effective, we propose a hyperbolic mesh optimization loss. Extensive experimental results on large publicly available datasets indicate that the method outperforms most state-of-the-art methods.

artificial intelligence, hyperbolic space, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2510.18256

Country: Europe > Switzerland (0.28)

Genre: Research Report (0.82)

Industry: Health & Medicine (0.31)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
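
The abstract above motivates hyperbolic space by its ability to represent hierarchies. The abstract does not say which hyperbolic model the authors use, so the following is only a minimal sketch that assumes the Poincaré ball model; the function names are illustrative and not taken from the paper. It shows that the same Euclidean gap corresponds to a much larger hyperbolic distance near the boundary of the ball, which is the property that leaves room for the many leaves of a tree-like structure.

```python
import math


def poincare_distance(u, v):
    """Geodesic distance between two points inside the unit Poincare ball.

    Assumption: the Poincare ball model is used here for illustration only.
    """
    sq_u = sum(x * x for x in u)
    sq_v = sum(x * x for x in v)
    if sq_u >= 1.0 or sq_v >= 1.0:
        raise ValueError("points must lie strictly inside the unit ball")
    sq_diff = sum((a - b) ** 2 for a, b in zip(u, v))
    # d(u, v) = arcosh(1 + 2 * |u - v|^2 / ((1 - |u|^2) * (1 - |v|^2)))
    return math.acosh(1.0 + 2.0 * sq_diff / ((1.0 - sq_u) * (1.0 - sq_v)))


def euclidean_distance(u, v):
    """Ordinary straight-line distance, for comparison."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


# Two pairs with the same Euclidean gap (0.05), one near the origin
# and one near the boundary of the ball.
near_origin = poincare_distance([0.0, 0.0], [0.05, 0.0])
near_boundary = poincare_distance([0.90, 0.0], [0.95, 0.0])
```

Here `near_boundary` is several times larger than `near_origin` even though both pairs are 0.05 apart in Euclidean terms. The distance from the origin to `[0.5, 0.0]` equals ln(3), which is a convenient sanity check.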

Latent-Info and Low-Dimensional Learning for Human Mesh Recovery and Parallel Optimization

Zhang, Xiang, Wu, Suping, Yang, Sheng

arXiv.org Artificial Intelligence, Oct-22-2025

Existing 3D human mesh recovery methods often fail to fully exploit latent information (e.g., human motion, shape alignment), leading to limb misalignment and insufficient local detail in the reconstructed human mesh, especially in complex scenes. Furthermore, the performance improvement gained by modelling mesh vertex and pose node interactions with attention mechanisms comes at a high computational cost. To address these issues, we propose a two-stage network for human mesh recovery based on latent information and low-dimensional learning. Specifically, the first stage of the network extracts global information (e.g., the overall shape alignment) and local information (e.g., textures, detail) from the low- and high-frequency components of image features and aggregates this information into a hybrid latent frequency domain feature. This strategy effectively extracts latent information. Subsequently, the extracted hybrid latent frequency domain features are used to enhance the learning that lifts 2D poses to 3D. In the second stage, with the assistance of the hybrid latent features, we model the interaction learning between the rough 3D human mesh template and the 3D pose, optimizing the pose and shape of the human mesh. Unlike existing mesh pose interaction methods, we design a low-dimensional mesh pose interaction method through dimensionality reduction and parallel optimization that significantly reduces computational costs without sacrificing reconstruction accuracy. Extensive experimental results on large publicly available datasets indicate that the method outperforms most state-of-the-art methods.

artificial intelligence, information, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2510.18267

Country: Europe > Switzerland (0.28)

Genre: Research Report (0.82)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.34)
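
The abstract above separates image features into low- and high-frequency components, associating the former with global structure and the latter with local detail. The abstract does not state which transform the authors use, so the sketch below is an assumption: it uses a simple moving-average low-pass filter on a 1D signal and treats the residual as the high-frequency part. The function name and the `radius` parameter are hypothetical.

```python
def split_frequencies(signal, radius=1):
    """Split a 1D signal into a smooth (low-frequency) part and a residual
    (high-frequency) part.

    Assumption: a moving average stands in for whatever frequency
    decomposition the paper actually uses.
    """
    n = len(signal)
    low = []
    for i in range(n):
        # Average over a window clamped to the signal boundaries.
        start = max(0, i - radius)
        stop = min(n, i + radius + 1)
        window = signal[start:stop]
        low.append(sum(window) / len(window))
    # The residual holds the fine detail removed by the smoothing.
    high = [s - l for s, l in zip(signal, low)]
    return low, high


# A slow ramp (global trend) with a small alternating ripple (local detail).
features = [0.0, 1.2, 1.8, 3.2, 3.8, 5.2, 5.8, 7.2]
low_part, high_part = split_frequencies(features, radius=1)
```

By construction, adding `low_part` and `high_part` element by element recovers the original signal, so the two parts can be processed separately and recombined without loss.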

Supplementary Materials for S-PIFu: Integrating Parametric Human Models with PIFu for Single-view Clothed Human Reconstruction

Neural Information Processing Systems, Aug-15-2025, 17:09:51 GMT

In Figure 1, we show S-PIFu's results when given images of test subjects who wear large clothing (e.g. SMPL-X body, and yet S-PIFu is able to reconstruct the human subjects accurately. Pixels that belong to the human subject but not to the SMPL-X body act as a natural regularizer that prevents S-PIFu from being overly reliant on estimated SMPL-X meshes to reconstruct clothed human meshes. This happens because these pixels only have valid values for the RGB channels and not the channels of our 2D feature maps (i.e. C, B, and N. Recall that C refers to coordinate In Figure 1, we observe what would happen if we feed a noisy SMPL-X mesh (i.e. a SMPL-X mesh SMPL-X mesh's arms (both arms).

information, s-pifu, smpl-x mesh, (14 more...)

Neural Information Processing Systems

Country: Asia > Japan > Honshū > Chūbu > Ishikawa Prefecture > Kanazawa (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Synergistic Global-space Camera and Human Reconstruction from Videos

Zhao, Yizhou, Wang, Tuanfeng Y., Raj, Bhiksha, Xu, Min, Yang, Jimei, Huang, Chun-Hao Paul

arXiv.org Artificial Intelligence, May-23-2024

Remarkable strides have been made in reconstructing static scenes or human bodies from monocular videos. Yet, the two problems have largely been approached independently, without much synergy. Most visual SLAM methods can only reconstruct camera trajectories and scene structures up to scale, while most HMR methods reconstruct human meshes in metric scale but fall short in reasoning with cameras and scenes. This work introduces Synergistic Camera and Human Reconstruction (SynCHMR) to marry the best of both worlds. Specifically, we design Human-aware Metric SLAM to reconstruct metric-scale camera poses and scene point clouds using camera-frame HMR as a strong prior, addressing depth, scale, and dynamic ambiguities. Conditioning on the dense scene recovered, we further learn a Scene-aware SMPL Denoiser to enhance world-frame HMR by incorporating spatio-temporal coherency and dynamic scene constraints. Together, they lead to consistent reconstructions of camera trajectories, human meshes, and dense scene point clouds in a common world frame. Project page: https://paulchhuang.github.io/synchmr

computer vision, pattern recognition, point cloud, (13 more...)

arXiv.org Artificial Intelligence

2405.14855

Country:

Asia > Japan > Honshū > Chūbu > Ishikawa Prefecture > Kanazawa (0.05)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)

Genre: Research Report (0.82)

Industry:

Health & Medicine (0.49)
Media (0.34)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Sensing and Signal Processing (0.93)

Co-Evolution of Pose and Mesh for 3D Human Body Estimation from Video

You, Yingxuan, Liu, Hong, Wang, Ti, Li, Wenhao, Ding, Runwei, Li, Xia

arXiv.org Artificial Intelligence, Aug-20-2023

Despite significant progress in single image-based 3D human mesh recovery, accurately and smoothly recovering 3D human motion from a video remains challenging. Existing video-based methods generally recover human mesh by estimating the complex pose and shape parameters from coupled image features, whose high complexity and low representation ability often result in inconsistent pose motion and limited shape patterns. To alleviate this issue, we introduce 3D pose as the intermediary and propose a Pose and Mesh Co-Evolution network (PMCE) that decouples this task into two parts: 1) video-based 3D human pose estimation and 2) mesh vertices regression from the estimated 3D pose and temporal image feature. Specifically, we propose a two-stream encoder that estimates mid-frame 3D pose and extracts a temporal image feature from the input image sequence. In addition, we design a co-evolution decoder that performs pose and mesh interactions with the image-guided Adaptive Layer Normalization (AdaLN) to make pose and mesh fit the human body shape. Extensive experiments demonstrate that the proposed PMCE outperforms previous state-of-the-art methods in terms of both per-frame accuracy and temporal consistency on three benchmark datasets: 3DPW, Human3.6M, and MPI-INF-3DHP. Our code is available at https://github.com/kasvii/PMCE.

artificial intelligence, estimation, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2308.10305

Country:

Asia > Japan > Honshū > Chūbu > Ishikawa Prefecture > Kanazawa (0.04)
Asia > China > Guangdong Province > Shenzhen (0.04)
Europe > Switzerland > Zürich > Zürich (0.04)

Genre: Research Report > New Finding (0.46)

Industry: Health & Medicine (0.72)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Robots > Humanoid Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
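
The PMCE abstract above mentions image-guided Adaptive Layer Normalization (AdaLN). The abstract gives no formula, so the sketch below shows only the general idea under stated assumptions: features are layer-normalized, then scaled and shifted by values predicted from a conditioning vector (here standing in for an image feature) through two linear maps. All names, shapes, and the exact scale/shift form are illustrative and may differ from the paper's implementation.

```python
import math


def layer_norm(x, eps=1e-5):
    """Normalize a feature vector to zero mean and (near) unit variance."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]


def linear(weights, bias, vec):
    """Apply a dense layer: one output per weight row."""
    return [sum(w * c for w, c in zip(row, vec)) + b
            for row, b in zip(weights, bias)]


def ada_layer_norm(x, cond, w_scale, b_scale, w_shift, b_shift):
    """Layer norm whose scale and shift are predicted from `cond`.

    Assumption: scale and shift come from plain linear maps of the
    conditioning vector; the paper may use a different parameterization.
    """
    scale = linear(w_scale, b_scale, cond)
    shift = linear(w_shift, b_shift, cond)
    normed = layer_norm(x)
    return [s * n + t for s, n, t in zip(scale, normed, shift)]


# Hypothetical 4-dimensional pose feature and 2-dimensional image condition.
pose_feature = [1.0, 2.0, 3.0, 4.0]
image_cond = [0.5, -1.0]
zeros = [[0.0, 0.0] for _ in range(4)]
# With zero weights, unit scale bias and zero shift bias, AdaLN reduces
# to plain layer normalization.
plain = ada_layer_norm(pose_feature, image_cond, zeros, [1.0] * 4, zeros, [0.0] * 4)
```

The point of the design is that the same normalized feature can be modulated differently for each input image, since `scale` and `shift` depend on `cond` rather than being fixed learned constants.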

POTTER: Pooling Attention Transformer for Efficient Human Mesh Recovery

Zheng, Ce, Liu, Xianpeng, Qi, Guo-Jun, Chen, Chen

arXiv.org Artificial Intelligence, Mar-23-2023

Transformer architectures have achieved SOTA performance on human mesh recovery (HMR) from monocular images. However, the performance gain has come at the cost of substantial memory and computational overhead. A lightweight and efficient model that reconstructs an accurate human mesh is needed for real-world applications. In this paper, we propose a pure transformer architecture named POoling aTtention TransformER (POTTER) for the HMR task from single images. Observing that the conventional attention module is expensive in both memory and computation, we propose an efficient pooling attention module, which significantly reduces the memory and computational cost without sacrificing performance. Furthermore, we design a new transformer architecture by integrating a High-Resolution (HR) stream for the HMR task. The high-resolution local and global features from the HR stream can be utilized for recovering a more accurate human mesh. Our POTTER outperforms the SOTA method METRO while requiring only 7% of the total parameters and 14% of the Multiply-Accumulate Operations on the Human3.6M (PA-MPJPE metric) and 3DPW (all three metrics) datasets. The project webpage is https://zczcwh.github.io/potter_page.

artificial intelligence, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2303.13357

Country:

North America > United States > North Carolina (0.04)
Asia > Japan > Honshū > Chūbu > Ishikawa Prefecture > Kanazawa (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
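
The POTTER abstract above attributes its savings to a pooling attention module but does not describe its design. The sketch below is therefore not POTTER's module; it illustrates one generic, well-known way pooling can cut attention cost: average-pool the keys and values over groups of tokens before computing attention, so the score matrix shrinks from N x N to N x (N / window). All function names and the `window` parameter are hypothetical.

```python
import math


def avg_pool_tokens(tokens, window):
    """Average consecutive groups of `window` tokens into one token each."""
    pooled = []
    for start in range(0, len(tokens), window):
        group = tokens[start:start + window]
        dim = len(group[0])
        pooled.append([sum(t[d] for t in group) / len(group) for d in range(dim)])
    return pooled


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def pooled_attention(queries, keys, values, window):
    """Scaled dot-product attention against pooled keys and values.

    Assumption: a generic pooled-key/value scheme, shown for illustration;
    it is not claimed to match the module proposed in the paper.
    """
    pooled_keys = avg_pool_tokens(keys, window)
    pooled_values = avg_pool_tokens(values, window)
    scale = 1.0 / math.sqrt(len(queries[0]))
    outputs = []
    for q in queries:
        scores = [scale * sum(a * b for a, b in zip(q, k)) for k in pooled_keys]
        weights = softmax(scores)
        dim = len(pooled_values[0])
        outputs.append([sum(w * v[d] for w, v in zip(weights, pooled_values))
                        for d in range(dim)])
    return outputs


# Four 2-dimensional tokens; pooling with window=2 halves the key/value count.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]]
attended = pooled_attention(tokens, tokens, tokens, window=2)
```

With `window=1` the function reduces to ordinary scaled dot-product attention, which makes the trade-off explicit: larger windows save memory and compute at the price of coarser keys and values.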

IntegratedPIFu: Integrated Pixel Aligned Implicit Function for Single-view Human Reconstruction

Chan, Kennard Yanting, Lin, Guosheng, Zhao, Haiyu, Lin, Weisi

arXiv.org Artificial Intelligence, Nov-15-2022

We propose IntegratedPIFu, a new pixel-aligned implicit model that builds on the foundation set by PIFuHD. IntegratedPIFu shows how depth and human parsing information can be predicted and capitalized upon in a pixel-aligned implicit model. In addition, IntegratedPIFu introduces depth-oriented sampling, a novel training scheme that improves any pixel-aligned implicit model's ability to reconstruct important human features without noisy artefacts. Lastly, IntegratedPIFu presents a new architecture that, despite using fewer model parameters than PIFuHD, is able to improve the structural correctness of reconstructed meshes. Our results show that IntegratedPIFu significantly outperforms existing state-of-the-art methods on single-view human reconstruction. We provide the code in our supplementary materials.

artificial intelligence, machine learning, pixel-aligned implicit model, (14 more...)

arXiv.org Artificial Intelligence

2211.07955

Country:

Asia > Japan > Honshū > Chūbu > Ishikawa Prefecture > Kanazawa (0.04)
North America > United States > Ohio > Montgomery County > Dayton (0.04)
Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)
Asia > Singapore (0.04)

Genre:

Research Report > New Finding (0.54)
Research Report > Promising Solution (0.34)

Industry: Health & Medicine (0.70)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)

CLIP-Actor: Text-Driven Recommendation and Stylization for Animating Human Meshes

Youwang, Kim, Ji-Yeon, Kim, Oh, Tae-Hyun

arXiv.org Artificial Intelligence, Jul-21-2022

We propose CLIP-Actor, a text-driven motion recommendation and neural mesh stylization system for human mesh animation. CLIP-Actor animates a 3D human mesh to conform to a text prompt by recommending a motion sequence and optimizing mesh style attributes. We build a text-driven human motion recommendation system by leveraging a large-scale human motion dataset with language labels. Given a natural language prompt, CLIP-Actor suggests a text-conforming human motion in a coarse-to-fine manner. Then, our novel zero-shot neural style optimization adds detail and texture to the recommended mesh sequence so that it conforms to the prompt in a temporally consistent and pose-agnostic manner. This is distinctive in that prior work fails to generate plausible results when the pose of an artist-designed mesh does not conform to the text from the beginning. We further propose spatio-temporal view augmentation and mask-weighted embedding attention, which stabilize the optimization process by leveraging multi-frame human motion and rejecting poorly rendered views. We demonstrate that CLIP-Actor produces a plausible, human-recognizable stylized 3D human mesh in motion, with detailed geometry and texture, solely from a natural language prompt.

clip-actor, computer vision, mesh, (14 more...)

arXiv.org Artificial Intelligence

2206.04382

Country: Asia > Japan > Honshū > Chūbu > Ishikawa Prefecture > Kanazawa (0.05)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(2 more...)

Learning Transferable 3D Adversarial Cloaks for Deep Trained Detectors

Maesumi, Arman, Zhu, Mingkang, Wang, Yi, Chen, Tianlong, Wang, Zhangyang, Bajaj, Chandrajit

arXiv.org Artificial Intelligence, Apr-22-2021

This paper presents a novel patch-based adversarial attack pipeline that trains adversarial patches on 3D human meshes. We sample triangular faces on a reference human mesh, and create an adversarial texture atlas over those faces. The adversarial texture is transferred to human meshes in various poses, which are rendered onto a collection of real-world background images. Contrary to the traditional patch-based adversarial attacks, where prior work attempts to fool trained object detectors using appended adversarial patches, this new form of attack is mapped into the 3D object world and back-propagated to the texture atlas through differentiable rendering. As such, the adversarial patch is trained under deformation consistent with real-world materials. In addition, and unlike existing adversarial patches, our new 3D adversarial patch is shown to fool state-of-the-art deep object detectors robustly under varying views, potentially leading to an attacking scheme that is persistently strong in the physical world.

adversarial patch, human mesh, mesh, (14 more...)

arXiv.org Artificial Intelligence

2104.11101

Country:

North America > United States > Texas > Travis County > Austin (0.05)
North America > United States > California > Los Angeles County > Long Beach (0.04)
Europe > Netherlands > North Holland > Amsterdam (0.04)
Asia > Myanmar > Tanintharyi Region > Dawei (0.04)

Genre: Research Report > New Finding (0.46)

Industry:

Information Technology > Security & Privacy (0.89)
Government > Military (0.89)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Security & Privacy (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)
